Exploring the Contextual Factors Affecting Multimodal Emotion Recognition in Videos
Authors
Abstract
Emotional expressions form a key part of user behavior on today's digital platforms. While multimodal emotion recognition techniques are gaining research attention, there is a lack of deeper understanding of how visual and non-visual features can be used to better recognize emotions in certain contexts, but not others. This study analyzes the interplay between the effects derived from facial expressions, tone of voice, and text in conjunction with two contextual factors: 1) the gender of the speaker, and 2) the duration of the emotional episode. Using a large public dataset of 2,176 manually annotated YouTube videos, we found that while multimodal features consistently outperformed bimodal and unimodal features, their performance varied significantly across different emotions and contexts. Multimodal features performed particularly well for male speakers in recognizing most emotions. Furthermore, they performed better on shorter than on longer videos for neutral, happiness, sadness, and anger. These findings offer new insights towards the development of more context-aware and empathetic systems.
Similar references
Multimodal Emotion Recognition
Multimodal fusion is the process whereby two or more forms of input are gathered together in order to produce a higher overall classification accuracy than individual unimodal systems. This is a popular technique in emotion recognition. In this study, we attempted to discover how much we could improve upon individual unimodal systems using decision level fusion. To accomplish this, we acquired ...
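Decision-level fusion as described above can be sketched as a weighted average of each unimodal classifier's class-probability output, with the fused prediction taken as the highest-scoring class. This is a minimal illustrative sketch, not the paper's implementation; the modality names, weights, and three-class setup are assumptions for the example.

```python
def decision_level_fusion(unimodal_probs, weights=None):
    """Fuse per-modality class-probability vectors by weighted averaging.

    unimodal_probs: list of probability vectors, one per modality.
    Returns (predicted_class_index, fused_probability_vector).
    """
    n = len(unimodal_probs)
    if weights is None:
        weights = [1.0 / n] * n  # equal weighting by default
    n_classes = len(unimodal_probs[0])
    fused = [sum(w * p[c] for w, p in zip(weights, unimodal_probs))
             for c in range(n_classes)]
    return fused.index(max(fused)), fused

# Hypothetical example: audio, video, and text classifiers each output
# probabilities over three emotion classes (e.g. happy, sad, angry).
audio = [0.6, 0.3, 0.1]
video = [0.5, 0.2, 0.3]
text  = [0.2, 0.7, 0.1]
label, fused = decision_level_fusion([audio, video, text])
```

With equal weights this reduces to averaging the three probability vectors; per-modality weights could instead reflect each unimodal system's validation accuracy.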
Multimodal Emotion Recognition
Speech is the primary means of communication between human beings in their day-to-day interaction with one another. Speech, if confined in meaning as the explicit verbal content of what is spoken, does not by itself carry all the information that is conveyed during a typical conversation, but is in fact nuanced and supplemented by additional modalities of information, in the form of vocalized e...
Multimodal Emotion Recognition
Recent technological advances have enabled human users to interact with computers in ways previously unimaginable. Beyond the confines of the keyboard and mouse, new modalities for human-computer interaction such as voice, gesture, and force-feedback are emerging. Despite important advances, one necessary ingredient for natural interaction is still missing: emotions. Emotions play an important r...
Multimodal Emotion Recognition Using Multimodal Deep Learning
To enhance the performance of affective models and reduce the cost of acquiring physiological signals for real-world applications, we adopt a multimodal deep learning approach to construct affective models from multiple physiological signals. For the unimodal enhancement task, we indicate that the best recognition accuracy of 82.11% on the SEED dataset is achieved with shared representations generated by...
MEMN: Multimodal Emotional Memory Network for Emotion Recognition in Dyadic Conversational Videos
Multimodal emotion recognition is a developing field of research which aims at detecting emotions in videos. For conversational videos, current methods mostly ignore the role of inter-speaker dependency relations while classifying emotions. In this paper, we address recognizing utterance-level emotions in dyadic conversations. We propose a deep neural framework, termed Multimodal Emotional Memo...
Journal
Journal title: IEEE Transactions on Affective Computing
Year: 2023
ISSN: 1949-3045, 2371-9850
DOI: https://doi.org/10.1109/taffc.2021.3071503